linking run_tdnn.sh (see https://groups.google.com/d/msg/kaldi-help/UAKh81Oapyw/etHsBG13BAAJ)#3056
jyhnnhyj wants to merge 9 commits into kaldi-asr:master from jyhnnhyj:master
|
OK, but you need to add the script that it points to as well. I'll wait till you have WERs for that, though. |
|
sure, started with a clean setup - will update the numbers once all done
(running on a small-ish machine with a single GPU only - so will take a
while)
about linking, run_tdnn.sh -> tuning/run_tdnn_1a.sh
I just added run_tdnn.sh which points to this script:
tuning/run_tdnn_1a.sh and it's already there - so nothing is missing as
far as I can see
~/kaldi/egs/tedlium/s5_r3/local/chain$ ls -la
total 16
drwxr-xr-x 3 morrie morrie 4096 Feb 26 11:08 .
drwxr-xr-x 5 morrie morrie 4096 Feb 26 11:07 ..
-rwxr-xr-x 1 morrie morrie 3334 Feb 26 11:07 compare_wer_general.sh
lrwxrwxrwx 1 morrie morrie 21 Feb 26 11:07 run_tdnnf.sh -> tuning/run_tdnn_1b.sh
lrwxrwxrwx 1 morrie morrie 21 Feb 26 11:08 run_tdnn.sh -> tuning/run_tdnn_1a.sh
drwxr-xr-x 2 morrie morrie 4096 Feb 26 11:08 tuning
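A quick way to confirm that a link such as run_tdnn.sh resolves to a file that really exists is a small shell check. The sketch below is illustrative only: it builds a throwaway directory that mimics the listing above, and the helper name check_link is made up for this example (it is not part of Kaldi).

```shell
#!/bin/sh
# Sketch: report whether a symlink resolves to an existing file.
# check_link is a hypothetical helper, not a Kaldi utility.
check_link() {
  if [ -L "$1" ] && [ -e "$1" ]; then
    # -L: the path is a symlink; -e follows the link, so it fails when dangling.
    echo "ok: $1 -> $(readlink "$1")"
  else
    echo "broken or missing: $1"
    return 1
  fi
}

# Throwaway layout that mimics local/chain (assumed for the demo only).
demo=$(mktemp -d)
mkdir -p "$demo/tuning"
touch "$demo/tuning/run_tdnn_1a.sh"
ln -s tuning/run_tdnn_1a.sh "$demo/run_tdnn.sh"
ln -s tuning/run_tdnn_1c.sh "$demo/run_dangling.sh"   # target does not exist

check_link "$demo/run_tdnn.sh"
check_link "$demo/run_dangling.sh" || true
```

A link can pass this check and still point at the wrong script, so it only answers the "is anything missing" part of the question.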
|
|
No, that script run_tdnn_1a.sh is not the right one, it is an older version
of the broken one in run_tdnnf.sh.
You should take a script from s5_r2.
Run with the number of jobs in the script, and --use-gpu wait, so it will
use only one GPU but give the same results.
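As a rough sketch of that advice: the job counts stay whatever the recipe script sets, and only the GPU option changes. The two numbers below are placeholders, not values taken from any recipe. To my knowledge --use-gpu and the --trainer.optimization.num-jobs-* options are accepted by steps/nnet3/chain/train.py, but check your Kaldi version. The snippet is a dry run that only prints a partial command line.

```shell
#!/bin/sh
# Placeholder job counts: keep the values that the recipe script already uses.
num_jobs_initial=3
num_jobs_final=12

# Only the options relevant here; a real call needs many more
# (feature dir, tree dir, lattice dir, output dir, and so on).
gpu_opts="--use-gpu=wait"
job_opts="--trainer.optimization.num-jobs-initial=$num_jobs_initial --trainer.optimization.num-jobs-final=$num_jobs_final"

# Dry run: print the partial command line instead of executing it.
train_cmd="steps/nnet3/chain/train.py $gpu_opts $job_opts"
echo "$train_cmd"
```

Keeping the job counts unchanged is what should keep the results comparable; the jobs simply take turns on the one GPU, so training takes longer.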
|
|
okay I see - sure, will use that file
will update the results here once done
|
|
a couple of questions: If I skip that step and change which seems a mismatch in params, I can try to fix these, but just wanted to double check if I should be using a different set of scripts... |
|
There are two problems here. Secondly, that run_tdnn.sh script, if just copied from s5_r2, may not be fully compatible with the setup. You will have to remove the --min-seg-len option, do a diff with the existing 'run_tdnn.sh' script, try to figure out which differences have to do with things like a change in the directory setup of tdnn s5_r3 vs. s5_r2 or other local changes, and apply those as needed. |
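The porting steps described here can be sketched with grep, diff and sed. The two files below are invented stand-ins for the s5_r2 and s5_r3 scripts, with made-up contents, so the example runs anywhere; on a real checkout the same three commands would be pointed at the actual scripts.

```shell
#!/bin/sh
demo=$(mktemp -d)

# Stand-ins for the two recipe scripts (contents invented for the demo).
cat > "$demo/r2_run_tdnn.sh" <<'EOF'
local/nnet3/run_ivector_common.sh --stage $stage --nj $nj \
  --min-seg-len $min_seg_len --train-set $train_set
EOF
cat > "$demo/r3_run_tdnn.sh" <<'EOF'
local/nnet3/run_ivector_common.sh --stage $stage --nj $nj \
  --train-set $train_set
EOF

# 1. Find every use of the option that has to go.
grep -n -e '--min-seg-len' "$demo/r2_run_tdnn.sh"

# 2. Review the remaining differences by hand (diff exits 1 when files differ).
diff -u "$demo/r2_run_tdnn.sh" "$demo/r3_run_tdnn.sh" || true

# 3. Drop the option together with its argument.
sed 's/--min-seg-len [^ ]* *//' "$demo/r2_run_tdnn.sh" > "$demo/patched.sh"
cat "$demo/patched.sh"
```

The sed expression only handles the option and its value on one line; any variable that merely defines the value (min_seg_len in this made-up example) would still need to be removed by hand.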
|
re Vimal's fix, I merged with it and can confirm it solves the problem. |
|
Any update?
On Fri, Mar 1, 2019 at 6:45 AM jyhnnhyj wrote:
> re Vimal's fix, I merged with it and can confirm it solves the problem.
> (re Python3, that's right, but during Kaldi setup it creates a link for
> Python2.7 and I was expecting it to pick that one) - but anyway, this is now
> solved. I'll continue with the rest and update the progress/issues here.
|
|
sorry for not updating earlier, I was distracted by a couple of other deadlines and plan to resume this on Monday - I already ran run_cleanup_segmentation.sh last week, it worked as expected, and I will start training on Monday |
|
a quick update that I just started running the training script |
|
so the ivector part ran successfully - but after that step, when it was validating the files, it failed |
|
Just remove the _comb part of the filename, it is something that used to
exist in older recipes, that we removed.
On Wed, Mar 13, 2019 at 6:12 AM jyhnnhyj wrote:
> so the ivector part ran successfully - but after that step, when it was
> validating the files, it fails
> local/chain/run_tdnn.sh: expected file data/train_cleaned_sp_hires_comb/feats.scp to exist
> I tried to understand what this _comb thing is about - but couldn't trace
> where it should have been created - any suggestions?
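For illustration, the rename suggested in this exchange amounts to stripping the _comb suffix from the data directory name. The sketch below applies that to the path from the error message; the commented-out grep at the end assumes the script lives at local/chain/run_tdnn.sh, as the error message suggests.

```shell
#!/bin/sh
# Path reported in the error message.
expected=data/train_cleaned_sp_hires_comb/feats.scp

# Strip the legacy _comb suffix from the directory name.
fixed=$(printf '%s\n' "$expected" | sed 's/_comb//')
echo "$fixed"

# In a recipe checkout, something like this would list the lines that
# still mention the old suffix (path is an assumption):
# grep -n '_comb' local/chain/run_tdnn.sh
```

Whether the directory without the suffix exists still depends on what the earlier feature-extraction stage wrote, so it is worth listing data/ before rerunning.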
|
|
so after those _comb changes, it progressed and then failed, complaining about config files. I checked librispeech and noticed how the new xconfig file is dumped |
|
I told you to take the script from tedlium s5_r2! I think you might be
getting it from tedlium s5!
On Thu, Mar 14, 2019 at 8:38 AM jyhnnhyj wrote:
> so after those comb changes, it progressed and now failed complaining
> about config files
> here is the error message:
> Traceback (most recent call last):
>   File "steps/nnet3/chain/train.py", line 625, in main
>     train(args, run_opts)
>   File "steps/nnet3/chain/train.py", line 302, in train
>     variables = common_train_lib.parse_generic_config_vars_file(var_file)
>   File "steps/libs/nnet3/train/common.py", line 352, in parse_generic_config_vars_file
>     "i.e. xconfig_to_configs.py.".format(field_value))
> Exception: You have num_hidden_layers=7 (real meaning: your config files are intended to do discriminative pretraining). Since Kaldi 5.2, this is no longer supported --> use newer config-creation scripts, i.e. xconfig_to_configs.py.
> I checked librispeech and noticed how the new xconfig file is dumped
> also another one in tedlium local/chain/run_tdnnf.sh
> which network config I should use?
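The exception is raised while the training script parses a generated vars file, so one way to tell early whether a recipe still produces the legacy config style is to look for the num_hidden_layers field before training starts. This is only a sketch: the configs/vars location and the file contents below are assumptions made for the demo, and has_legacy_vars is a made-up helper name.

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir -p "$demo/configs"

# Invented vars file that imitates what an old config-creation script writes.
printf 'model_left_context=13\nmodel_right_context=9\nnum_hidden_layers=7\n' \
  > "$demo/configs/vars"

# Hypothetical helper: succeeds when the legacy field is present.
has_legacy_vars() {
  grep -q '^num_hidden_layers=' "$1"
}

if has_legacy_vars "$demo/configs/vars"; then
  echo "legacy config detected: regenerate configs with xconfig_to_configs.py"
fi
```

Recipes written for the newer setup typically write a network.xconfig file and pass it to steps/nnet3/xconfig_to_configs.py, which is what the error message points to.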
|
|
you're right - I'm sorry, somehow when trying to fix some of the issues I re-used that old script |
|
a quick update about the training progress, it's in iteration 90/227 |
|
all done, here are the results: not sure how these compare to the previous results? so the new results seem to be slightly worse than tdnn1g_sp, but better than tdnn1f_sp_bi? |
|
OK. It's still better than the results in the current run_tdnnf.sh, but
not by as much as I had hoped.
Please show the output of steps/info/chain_dir_info.pl exp/chain/tdnn1g_sp
And also you should be calling this 1c, and naming the script
run_tdnn_1c.sh.
But that's not urgent right now. Please make sure the script is in your
PR, I want to have a look and see that everything looks right.
On Mon, Mar 18, 2019 at 6:08 AM jyhnnhyj wrote:
> all done, here are the results:
> dev: %WER 8.03 [ 1428 / 17783, 255 ins, 274 del, 899 sub ]
> dev_rescore: %WER 7.44 [ 1323 / 17783, 242 ins, 267 del, 814 sub ]
> test: %WER 10.11 [ 2780 / 27500, 252 ins, 1083 del, 1445 sub ]
> test_rescore: %WER 7.85 [ 2158 / 27500, 323 ins, 560 del, 1275 sub ]
> not sure how these compare to the previous results?
> In the header of run_tdnn.sh, I can see:
> # System                  tdnn1f_sp_bi   tdnn1g_sp
> # WER on dev(orig)            8.9           7.9
> # WER on dev(rescored)        8.1           7.3
> # WER on test(orig)           9.1           8.0
> # WER on test(rescored)       8.6           7.6
> so the new results seem to be slightly worse than tdnn1g_sp, but better than
> tdnn1f_sp_bi?
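When comparing against the numbers in a script header, it can help to recompute each %WER from the bracketed counts, since WER is simply errors (ins + del + sub) divided by the number of reference words. A minimal sketch, using a made-up helper named wer and the counts quoted in the results above:

```shell
#!/bin/sh
# wer ERRORS WORDS: print 100 * errors / words with two decimals.
# Hypothetical helper for this example, not a Kaldi script.
wer() {
  awk -v e="$1" -v n="$2" 'BEGIN { printf "%.2f\n", 100 * e / n }'
}

wer 1428 17783   # dev
wer 1323 17783   # dev_rescore
wer 2780 27500   # test
wer 2158 27500   # test_rescore

# The error count itself is the sum of the three error types, e.g. for dev:
echo $((255 + 274 + 899))
```

The recomputed values match the reported 8.03, 7.44, 10.11 and 7.85, so the figures are internally consistent; the header table is rounded to one decimal, which is worth keeping in mind when the differences are this small.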
|
|
sure |
|
made the changes, but somehow messed up my kaldi fork, had to delete it and now can't push to this anymore |
|
OK, we will discuss on #3149. |
Dan's suggestion at https://groups.google.com/d/msg/kaldi-help/UAKh81Oapyw/etHsBG13BAAJ